PROBABILISTIC MATCH SIMILARITY MEASURE FOR DOCUMENT CLUSTERING
Authors
Abstract
Similar resources
Privacy Preserving MFI Based Similarity Measure For Hierarchical Document Clustering
The rapid growth of the World Wide Web has imposed great challenges for researchers in improving search efficiency over the internet. Nowadays, web document clustering has become an important research topic for surfacing the most relevant documents in the huge volumes of results returned in response to a simple query. In this paper, we first propose a novel approach to precisely define clusters base...
A Novel Multi-Viewpoint Based Similarity Measure for Document Clustering
Data mining is the process of analyzing data to uncover patterns or trends. It comprises many techniques, and related fields such as text mining and web mining also exist. Clustering is one of the most important data mining or text mining algorithms, used to group similar objects together. In other words, it is used to organize the give...
Using a Wikipedia-based Semantic Relatedness Measure for Document Clustering
A graph-based distance between Wikipedia articles is defined using a random walk model, which estimates visiting probability (VP) between articles using two types of links: hyperlinks and lexical similarity relations. The VP to and from a set of articles is then computed, and approximations are proposed to make tractable the computation of semantic relatedness between every two texts in a large...
XML Document Probabilistic Clustering Based on Structure and Content
A large volume of information is stored in XML format on the Web, and clustering is a method for managing these documents. Most current methods for clustering XML documents consider only one of two aspects, structure or content. In this paper, we propose SCEM (Expectation Maximization Structure and Content) for XML documents, which is used to effectively cluster XML documents by combining content and structur...
Affinity-Based Probabilistic Reasoning and Document Clustering on the WWW
The World Wide Web (WWW) has become one of the fastest growing applications on the Internet today. More and more information sources have been linked online through the WWW, but finding information on the WWW remains a great challenge. For most users, the retrieved information is not well organized, and access times on the WWW are currently considered high. Therefore, there is a need to develop ...
Journal
Journal title: International Journal on Information Sciences and Computing
Year: 2015
ISSN: 0973-9092
DOI: 10.18000/ijisac.50156